Incorporating User Behavior Information in IR Evaluation
Abstract
Many evaluation measures in Information Retrieval (IR) can be viewed as simple user models. Meanwhile, search logs provide information about how real users search. This paper describes our attempts to reconcile click log information with user-centric IR measures, bringing the measures into agreement with the logs. Studying the discount curves of NDCG and RBP leads us to extend them by incorporating the probability of click into their discount curves. We measure the accuracy of user models by calculating ‘session likelihood’. This leads us to propose a new IR evaluation measure, Expected Browsing Utility (EBU), based on a more sophisticated user model. EBU has better session likelihood than existing measures, so we argue it is a better user-centric IR measure.
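For readers unfamiliar with the discount curves mentioned in the abstract, the following is a minimal sketch of the standard rank discounts used by NDCG and RBP. It is not the paper's click-based extension and it does not implement EBU, whose definitions are not given on this page. The function names, the default persistence value of 0.8, and the example gain vector are illustrative assumptions.

```python
import math
from typing import Callable, Sequence


def ndcg_discount(rank: int) -> float:
    """Standard NDCG discount for a 1-based rank: 1 / log2(rank + 1)."""
    return 1.0 / math.log2(rank + 1)


def rbp_discount(rank: int, persistence: float = 0.8) -> float:
    """Unnormalized RBP discount for a 1-based rank: persistence ** (rank - 1).

    The persistence value is an assumed example; RBP usually also multiplies
    the sum by (1 - persistence), which is omitted here.
    """
    return persistence ** (rank - 1)


def discounted_gain(gains: Sequence[float],
                    discount: Callable[[int], float]) -> float:
    """Sum each document's gain weighted by the discount at its rank."""
    return sum(g * discount(r) for r, g in enumerate(gains, start=1))


# Hypothetical ranked list: relevant documents at ranks 1 and 3.
example_gains = [1.0, 0.0, 1.0]
ndcg_style = discounted_gain(example_gains, ndcg_discount)
rbp_style = discounted_gain(example_gains, rbp_discount)
```

Both discounts can be read as a simple user model: the weight at a rank stands for how likely a user is to reach that position, which is the quantity the abstract proposes to bring into agreement with click logs.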
Similar resources
Reflections on Information Retrieval Evaluation
Information retrieval (IR) research has primarily consisted of two paradigms: systems-oriented research and user studies. Systems-oriented research includes IR algorithm development and evaluation and, to some degree, human-system interaction. User studies include human information behavior and information seeking research. These paradigms have contributed new IR algorithms and insights into IR ...
Behavioral Considerations in Developing Web Information Systems: User-centered Design Agenda
The current paper explores designing a web information retrieval system regarding the searching behavior of users in real and everyday life. Designing an information system that is closely linked to human behavior is equally important for providers and the end users. From an Information Science point of view, four approaches in designing information retrieval systems were identified as system-...
WHOSE - A Tool for Whole-Session Analysis in IIR
One of the main challenges in Interactive Information Retrieval (IIR) evaluation is the development and application of re-usable tools that allow researchers to analyze search behavior of real users in different environments and different domains, but with comparable results. Furthermore, IIR recently focuses more on the analysis of whole sessions, which includes all user interactions that are ...
Contextual Simulations for Information Retrieval Evaluation
Non-interactive evaluations of Information Retrieval (IR) systems do not model many of the contextual factors that influence real users’ information seeking. As such, they may give overly simplified grounds for IR system comparison. This paper advocates the use of rich contextual simulations (i.e., simulations of user behavior and the factors that influence it) to extend and enhance the non-inte...
Median measure: an approach to IR systems evaluation
In this paper we report results from three studies examining 1295 relevance judgments by 36 IR system end-users. We examined both the region of the relevance judgment, from non-relevant to highly relevant, and motivations or levels of their relevance judgments. Our study has three major findings. First, the frequency distributions of relevance judgments by IR system end-users tend to take on a ...